Direct mapped cache performance modeling for sparse matrix operations
Abstract
Sparse matrices are at the core of many numerical applications. Their compressed storage, which saves both operations and memory, generates irregular access patterns that reduce the performance of the memory hierarchy. In this work we present a probabilistic model for predicting the number of misses in a direct mapped cache memory, considering sparse matrices with a uniform distribution of entries. The number of misses is directly related to program execution time and memory hierarchy performance. The model considers the three standard types of interference: intrinsic, self, and cross interference. We explain in detail the modeling of a representative matrix operation, the sparse matrix-dense matrix product, considering several loop orderings, and include validation results that show the model's accuracy.
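To make the quantity being modeled concrete, the following is a minimal sketch that counts direct mapped cache misses by brute-force simulation of the address trace of a sparse matrix-dense matrix product. It is not the probabilistic model described in the abstract. The CSR storage, the back-to-back memory layout, the loop ordering, and all function names (`count_direct_mapped_misses`, `uniform_csr`, `spmm_trace`) are assumptions made for illustration only.

```python
import random
from typing import Iterable, Iterator, List, Tuple


def count_direct_mapped_misses(addresses: Iterable[int],
                               cache_lines: int,
                               line_words: int) -> int:
    """Count misses of a direct mapped cache over a trace of word addresses."""
    resident = [None] * cache_lines  # memory line currently held by each cache line
    misses = 0
    for addr in addresses:
        mem_line = addr // line_words    # memory line containing this word
        slot = mem_line % cache_lines    # the only cache line it can map to
        if resident[slot] != mem_line:   # cold miss or interference miss
            resident[slot] = mem_line
            misses += 1
    return misses


def uniform_csr(n_rows: int, n_cols: int, density: float,
                seed: int = 0) -> Tuple[List[int], List[int]]:
    """Build the CSR pattern (row_ptr, col_idx) of a matrix in which every
    position holds a nonzero independently with probability `density`."""
    rng = random.Random(seed)
    row_ptr, col_idx = [0], []
    for _ in range(n_rows):
        for j in range(n_cols):
            if rng.random() < density:
                col_idx.append(j)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx


def spmm_trace(row_ptr: List[int], col_idx: List[int],
               n_cols_a: int, n_cols_b: int) -> Iterator[int]:
    """Yield the word addresses touched by C = A * B, with A in CSR form and
    B, C dense row-major. Assumed loop order: rows of A, then nonzeros of
    the row, then columns of B."""
    n_rows = len(row_ptr) - 1
    nnz = len(col_idx)
    # Assumed layout: values, column indices, row pointers, B, C, back to back.
    base_val = 0
    base_col = base_val + nnz
    base_ptr = base_col + nnz
    base_b = base_ptr + n_rows + 1
    base_c = base_b + n_cols_a * n_cols_b
    for i in range(n_rows):
        yield base_ptr + i          # row_ptr[i]
        yield base_ptr + i + 1      # row_ptr[i + 1]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            yield base_val + k      # value of the k-th nonzero
            yield base_col + k      # its column index
            j = col_idx[k]
            for c in range(n_cols_b):
                yield base_b + j * n_cols_b + c   # B[j][c], irregular in j
                yield base_c + i * n_cols_b + c   # C[i][c]


if __name__ == "__main__":
    row_ptr, col_idx = uniform_csr(64, 64, density=0.05, seed=1)
    trace = spmm_trace(row_ptr, col_idx, n_cols_a=64, n_cols_b=8)
    print(count_direct_mapped_misses(trace, cache_lines=256, line_words=4))
```

Reordering the loops inside `spmm_trace` changes the trace and therefore the miss count, which is the effect the abstract refers to when it mentions several loop orderings. A simulation like this can serve as a reference when checking an analytical prediction on small inputs.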
Similar resources
Performance Optimization and Evaluation for Linear Codes
In this paper, we develop a probabilistic model for estimating the number of cache misses during sparse matrix-vector multiplication (for both general and symmetric matrices) and the Conjugate Gradient algorithm for three types of data caches: direct mapped, and s-way set associative with random or LRU replacement strategies. Using hardware cache monitoring tools, we compare the predicted number...
Cache Misses Prediction for High Performance Sparse Algorithms
Many scientific applications handle compressed sparse matrices. Cache behavior during the execution of codes with irregular access patterns, such as those generated by this type of matrix, has not been widely studied. In this work a probabilistic model for the prediction of the number of misses on a direct mapped cache memory considering sparse matrices with a uniform distribution is presente...
Analytical Modeling of Optimized Sparse Linear Code
In this paper, we describe source code transformations based on software pipelining, loop unrolling, and loop fusion for sparse matrix-vector multiplication and for the Conjugate Gradient algorithm that enable data prefetching and overlapping of load and FPU arithmetic instructions and improve temporal cache locality. We develop a probabilistic model for estimation of the numbers of cache mis...
Cache Oblivious Dense and Sparse Matrix Multiplication Based on Peano Curves
Cache oblivious algorithms are designed to benefit from any existing cache hierarchy—regardless of cache size or architecture. In matrix computations, cache oblivious approaches are usually obtained from block-recursive approaches. In this article, we extend an existing cache oblivious approach for matrix operations, which is based on Peano space-filling curves, for multiplication of sparse and...
Modeling Set Associative Caches Behaviour for Irregular Computations
While much work has been devoted to the study of cache behavior during the execution of codes with regular access patterns, little attention has been paid to irregular codes. An important portion of these codes are scientific applications that handle compressed sparse matrices. In this work a probabilistic model for the prediction of the number of misses on a K-way associative cache memory consi...
Publication date: 1999